Adapts Qwen3-Omni PD disaggregation to current config-refactor by spencerr221 · Pull Request #2947 · vllm-project/vllm-omni

spencerr221 · 2026-04-20T10:15:45Z

Purpose

This PR adapts Qwen3-Omni PD (prefill-decode) separation to the new config-refactor flow introduced by #2383, without adding a new deploy YAML.

It adds pd_separation support to the deploy-based config pipeline so the existing vllm_omni/deploy/qwen3_omni_moe.yaml can dynamically expand the original 3-stage Qwen3-Omni pipeline into a 4-stage PD layout at merge time. When enabled, the thinker stage is split into prefill and decode stages, downstream stage IDs and connectors are remapped, and KV transfer settings are injected from deploy config.

This keeps the new pipeline+deploy model intact, preserves the existing PD detection/runtime logic, and avoids introducing a separate PD-specific config file after the config refactor.
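
For readers skimming the thread, the expansion described above can be sketched roughly as follows. This is only an illustration under assumptions: the stage dictionaries, the field names (`stage_id`, `name`, `inputs`, `kv_transfer`), the `pd_separation` layout, the connector value, and the `talker`/`code2wav` stage names are hypothetical and do not mirror the actual vllm-omni schema or the code in this PR.

```python
from copy import deepcopy


def expand_pd_stages(stages, pd_separation):
    """Return a new stage list with the thinker split into prefill and decode.

    Hypothetical schema: each stage is a dict with ``stage_id``, ``name`` and
    ``inputs`` (IDs of upstream stages). Not the real vllm-omni structures.
    """
    if not pd_separation.get("enabled", False):
        return deepcopy(stages)

    thinker_id = next(s["stage_id"] for s in stages if s["name"] == "thinker")
    kv = pd_separation.get("kv_transfer", {})

    expanded = []
    for stage in stages:
        sid = stage["stage_id"]
        if sid == thinker_id:
            # Prefill keeps the thinker's ID and upstream edges; it produces KV.
            prefill = deepcopy(stage)
            prefill.update(name="thinker_prefill",
                           kv_transfer={**kv, "role": "producer"})
            # Decode takes the next ID, reads from prefill, and consumes KV.
            decode = deepcopy(stage)
            decode.update(name="thinker_decode", stage_id=sid + 1, inputs=[sid],
                          kv_transfer={**kv, "role": "consumer"})
            expanded += [prefill, decode]
        else:
            moved = deepcopy(stage)
            # Stages after the thinker shift down by one slot.
            if sid > thinker_id:
                moved["stage_id"] = sid + 1
            # Edges that pointed at the thinker now point at decode;
            # edges to later stages shift by one as well.
            moved["inputs"] = [i + 1 if i >= thinker_id else i
                               for i in stage["inputs"]]
            expanded.append(moved)
    return expanded


# Illustrative 3-stage layout (stage names other than "thinker" are assumed).
three_stage = [
    {"stage_id": 0, "name": "thinker", "inputs": []},
    {"stage_id": 1, "name": "talker", "inputs": [0]},
    {"stage_id": 2, "name": "code2wav", "inputs": [1]},
]
four_stage = expand_pd_stages(
    three_stage,
    {"enabled": True, "kv_transfer": {"connector": "ExampleConnector"}},
)
for s in four_stage:
    print(s["stage_id"], s["name"], s["inputs"])
```

With the flag disabled the function returns an unchanged copy, which matches the intent of keeping a single deploy YAML for both layouts.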

Test Plan

Test Result

amy-why-3459 · 2026-04-20T10:44:47Z

Can you add PD separation performance test cases?

hsliuustc0106

please update the docs as well

hsliuustc0106

BLOCKING:

Gate Check — pre-commit is failing. Please fix pre-commit issues before proceeding with review.

hsliuustc0106

BLOCKING:

Gate Check — pre-commit still failing (end-of-file-fixer on test_qwen3_omni_expansion.py, E501 line too long in stage_config.py:757, ruff format). Please run pre-commit run --all-files locally and push the fix.

Non-blocking:

PR description Test Plan / Test Result sections are empty. Please add CI results or local run output showing the PD expansion tests pass (e.g. test_pd_disaggregation.py, test_config_factory.py::test_merge_pipeline_deploy_with_pd_disaggregation).

Gaohan123 · 2026-04-26T07:04:41Z

 if CONFIG_FILE_PATH is None:
    print(
        "No --test-config-file in argv, using default: tests/dfx/perf/tests/test_qwen_omni.json "
-        "(override with e.g. --test-config-file tests/dfx/perf/tests/test_tts.json)"


revert this

Gaohan123 · 2026-04-26T07:08:36Z

                    resources:
                      limits:
-                        nvidia.com/gpu: 2
+                        nvidia.com/gpu: 3


I suggest adding a separate label for the PD test rather than running all omni tests with 3 cards.

If you separate out a job for the PD test, maybe you can use the job names Omni · Function Test with 2 H100 and Omni · Function Test with 3 H100.

spencerr221 requested a review from hsliuustc0106 as a code owner April 20, 2026 10:15

gcanlin reviewed Apr 20, 2026

Comment thread vllm_omni/deploy/qwen3_omni_moe.yaml Outdated

gcanlin reviewed Apr 20, 2026

Comment thread vllm_omni/deploy/qwen3_omni_moe.yaml Outdated

hsliuustc0106 reviewed Apr 20, 2026

amy-why-3459 reviewed Apr 21, 2026

Comment thread vllm_omni/deploy/qwen3_omni_moe.yaml Outdated

amy-why-3459 reviewed Apr 21, 2026

Comment thread tests/e2e/online_serving/test_qwen3_omni.py Outdated

spencerr221 force-pushed the adapt_config branch from 8e7d305 to 074e448 April 21, 2026 07:23

hsliuustc0106 requested changes Apr 21, 2026

spencerr221 force-pushed the adapt_config branch 3 times, most recently from fd9136c to cd5e8d0 April 23, 2026 10:03

Gaohan123 added the ready (label to trigger buildkite CI) and omni-test (label to trigger buildkite omni model test in nightly CI) labels Apr 23, 2026

spencerr221 force-pushed the adapt_config branch from 3407e76 to 701b6ff April 24, 2026 01:57

yenuo26 reviewed Apr 24, 2026

Comment thread tests/e2e/online_serving/test_qwen3_omni_expansion.py

yenuo26 reviewed Apr 24, 2026

Comment thread tests/entrypoints/test_pd_disaggregation.py

spencerr221 mentioned this pull request Apr 24, 2026

[RFC]: Support Prefill-Decode Disaggregation for vLLM-Omni Thinker Stage via vLLM KV Transfer JiusiServe/vllm-omni#92

spencerr221 force-pushed the adapt_config branch from 701b6ff to 4c0eb2b April 24, 2026 07:27

spencerr221 changed the title ~~adapts Qwen3-Omni PD (prefill-decode) separation to current config-refactor~~ Adapts Qwen3-Omni PD disaggregation to current config-refactor Apr 24, 2026

Gaohan123 added this to the v0.20.0 milestone Apr 24, 2026

spencerr221 force-pushed the adapt_config branch 2 times, most recently from 3fd973b to 8a7050e April 26, 2026 04:17

Gaohan123 reviewed Apr 26, 2026

spencerr221 force-pushed the adapt_config branch 2 times, most recently from 316e921 to fa080f7 April 27, 2026 03:58

hsliuustc0106 removed the ready (label to trigger buildkite CI) and omni-test (label to trigger buildkite omni model test in nightly CI) labels Apr 29, 2026

Gaohan123 removed this from the v0.20.0 milestone Apr 30, 2026

spencerr221 force-pushed the adapt_config branch from fa080f7 to db76ada May 5, 2026 08:39

spencerr221 added 15 commits May 5, 2026 16:45

fix bug of parallel mode.

61db1aa

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix merge

963a643

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix merge

afe6039

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix OOM.

90c250a

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix bug of parallel mode.

bbf388d

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix bug of parallel mode.

74c900d

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix ci.

947234f

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix ci.

df80a7e

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix ci.

0f78892

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix ci.

5c6f308

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix ci.

36e4448

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix ci.

6be87f9

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix ci.

189769e

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix bug of parallel mode.

c8c0208

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

fix bug of parallel mode.

db76ada

Signed-off-by: LiuBingyu <liubingyu62@gmail.com>

Gaohan123 added this to the v0.22.0 milestone May 11, 2026

partically fix bug of missing eos and speaker002, but remain issue with mooncake.

a847d96

Signed-off-by: Liubingyu <liubingyu62@gmail.com>

spencerr221 requested review from ZeldaHuang, congw729, david6666666, linyueqian, princepride, tzhouam and yuanheng-zhao as code owners May 12, 2026 10:05

spencerr221 wants to merge 16 commits into vllm-project:main from spencerr221:adapt_config

7 participants
